Methods to Enhance Transformation in Near Real Time ETL
Abstract
Several techniques can be applied during the transformation phase of near-real-time ETL to improve its speed and accuracy. The transformation phase concentrates on converting transactional data into a format that is semantically suitable for the data warehouse. We propose solutions for this phase, such as advanced query optimization techniques and a redesigned workflow that allows some tasks to be rescheduled. For example, a function applied on two parallel flows could be applied only once if the flows converge. We also examine solutions for stream data: how stream data and stored data can be merged, and the associated challenges such as speed and memory utilization. Finally, we explore event-based transformation for selected items and efficient handling of metadata so that it adds value to the transformation phase.
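The workflow rescheduling idea mentioned in the abstract (applying a function once after two parallel flows converge, instead of once per flow) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `normalize` function, the row layout, and both helper names are hypothetical and chosen only for illustration. The rewrite is assumed to be valid only when the function is row-wise and does not depend on which flow a row came from.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def normalize(row: Row) -> Row:
    """Hypothetical row-wise transformation: trim and upper-case a country code."""
    return {**row, "country": str(row["country"]).strip().upper()}


def transform_then_merge(flow_a: List[Row], flow_b: List[Row],
                         fn: Callable[[Row], Row]) -> List[Row]:
    """Original workflow: fn appears as a separate step on each parallel flow."""
    return [fn(r) for r in flow_a] + [fn(r) for r in flow_b]


def merge_then_transform(flow_a: List[Row], flow_b: List[Row],
                         fn: Callable[[Row], Row]) -> List[Row]:
    """Rescheduled workflow: the flows converge first, then fn is applied once."""
    merged = list(flow_a) + list(flow_b)
    return [fn(r) for r in merged]


# Illustrative input rows (made up for this sketch).
flow_a = [{"id": 1, "country": " de "}, {"id": 2, "country": "fr"}]
flow_b = [{"id": 3, "country": "us "}]

before = transform_then_merge(flow_a, flow_b, normalize)
after = merge_then_transform(flow_a, flow_b, normalize)
```

Both workflows produce the same rows here. The per-row work is unchanged; what the rescheduled version saves is a duplicated step in the workflow, leaving a single operator to set up, schedule, and maintain.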
Similar resources
Near-Real-Time Parallel ETL+Q for Automatic Scalability in BigData
In this paper we investigate the problem of providing scalability to near-real-time ETL+Q (Extract, transform, load and querying) process of data warehouses. In general, data loading, transformation and integration are heavy tasks that are performed only periodically during small fixed time windows. We propose an approach to enable the automatic scalability and freshness of any data warehouse a...
Container-Managed ETL Applications for Integrating Data in Near Real-Time
As the analytical capabilities and applications of e-business systems expand, providing real-time access to critical business performance indicators to improve the speed and effectiveness of business operations has become crucial. The monitoring of business activities requires focused, yet incremental enterprise application integration (EAI) efforts and balancing information requirements in rea...
Integrating Data in Near Real-Time
As the analytical capabilities and applications of e-business systems expand, providing real-time access to critical business performance indicators to improve the speed and effectiveness of business operations has become crucial. The monitoring of business activities requires focused, yet incremental enterprise application integration (EAI) efforts and balancing information requirements in rea...
Efficient ETL+Q for Automatic Scalability in Big or Small Data Scenarios
In this paper, we investigate the problem of providing scalability to the data Extraction, Transformation, Load and Querying (ETL+Q) process of data warehouses. In general, data loading, transformation and integration are heavy tasks that are performed only periodically. Parallel architectures and mechanisms are able to optimize the ETL process by speeding up each part of the pipeline process as mor...
Striving towards Near Real-Time Data Integration for Data Warehouses
The amount of information available to large-scale enterprises is growing rapidly. While operational systems are designed to meet well-specified (short) response time requirements, the focus of data warehouses is generally the strategic analysis of business data integrated from heterogeneous source systems. The decision making process in traditional data warehouse environments is often delayed ...